ABSTRACT

This paper describes some low cost approaches to 
coherent BPSK demodulation for mobile satellite re- 
ceivers. The specific application is an Inmarsat-C Land 
Mobile Earth Station (LMES), but the techniques are 
applicable to any PSK demodulator. The techniques 
discussed include combined sampling and quadrature 
downconversion with a single A/D, and novel DSP al- 
gorithms for carrier acquisition offering both superior 
performance and economy of DSP resources. The DSP 
algorithms run at 5.7 MIPS and the entire DSP subsys- 
tem, built with commercially available parts, costs un- 
der $60 at quantity-10,000. 

INTRODUCTION 

Low cost mobile terminals are essential to the com- 
mercial success of many of the mobile satellite services 
(MSS) being launched today. This is because the ser- 
vices are based on the premise of a mass market, whose 
penetration will be critically dependent on the terminal 
cost. Driven by the desire to also minimize the space
segment cost, some mobile satellite service providers,
such as Inmarsat in its "C" standard, have opted for
coherent rather than differential PSK modulation.
Coherent demodulation is more complex than differential
detection, the more popular approach, and tends to
increase the terminal cost.

Modern PSK demodulators are almost invariably 
implemented by DSP. However, it is not widely recog- 
nized that, per today’s pricing of programmable DSP 
chips, the cost of the DSP solution goes up quite rapidly 
with the processing speed (millions of instructions per 
second, or MIPS) and the on-chip memory. Floating
point chips also command a premium over fixed point
devices. Table 1 shows the MIPS, internal memory and
unit cost at quantity-10,000 for some popular
programmable DSP chips.

It is clear from Table 1 that there is considerable
incentive to engineer the DSP algorithms for minimum
MIPS and on-chip memory, while keeping the performance
acceptable. In the present case, "acceptable
performance" was that defined in the Inmarsat-C System
Definition Manual (SDM).


Table 1. Comparison of Low Cost DSP Chips 


Vendor          Product       Type              Cycle   On-Chip              On-Chip              Unit Cost
                                                Speed   Prog. Mem.           Data Mem.            @ qty. 10K
                                                (MHz)                                             ($)

Analog Devices  ADSP-2105     16-bit Fixed Pt.  10.0    1K x 24 (RAM)        512 x 16 (RAM)       20.00
                ADSP-2101     16-bit Fixed Pt.  12.5    2K x 24 (RAM)        1K x 16 (RAM)        38.00
                ADSP-21010    32-bit Fl. Pt.    12.5    external             external             49.88

Texas Instr.    TMS320C25     16-bit Fixed Pt.  12.8    4K x 16 * (ROM)      544 x 16 (RAM)       14.86
                TMS320C51     16-bit Fixed Pt.  28.5    8K x 16 (ROM),       1K x 16 (dual RAM)   28.89
                                                        1K x 16 (RAM)
                TMS320C31-27  32-bit Fl. Pt.    13.5    4K x 32 (dual ROM)   2K x 32 (dual ROM)   40.00


Prices are for the Industrial temperature range (-25 C to +85 C),
except for TMS320C51 and TMS320C31-27, which are currently
not available in that temperature range.

(The Industrial temperature range is preferred for land mobile services.)

In this paper, the techniques discussed are (1) si- 
multaneous bandpass sampling and quadrature down- 
conversion using a single A/D, and (2) a novel DSP al- 
gorithm for carrier acquisition, with features suitable 
for MSS. Other DSP innovations are also featured in 
the Rockwell Inmarsat-C terminal but are not discussed 
here for lack of space and proprietary reasons. 

Demodulator Hardware Architecture 

Figure 1 shows functions performed in DSP in the 
Rockwell LMES. 



(Block diagram label: Status-output/Control-input link to Control Processor)


Figure 1. Demodulator Hardware Architecture 

The received signal at the IF of 450 kHz is simultaneously
sampled and quadrature downconverted to the
nominal frequency of 0 Hz by a single 8-bit A/D. 
Thereafter, the complex samples are processed in the 
DSP chip to perform the functions of carrier frequency 
and phase estimation, symbol synchronization, BPSK 
matched filtering, frame deinterleaving, UW detection 
and tracking, erasure sensing and Viterbi decoding. Of 
these, only carrier frequency and phase estimation are 
discussed here. 

INPUT SAMPLING/DOWNCONVERSION

Subharmonic quadrature sampling was the tech- 
nique used. This combined the processes of input sam- 
pling and quadrature downconversion, leading to a sig- 
nificant reduction or simplification of the signal condi- 
tioning circuitry preceding the A/D. We first discuss 
two conventional approaches and then describe the ap- 
proach used. 

Conventional Approach A 

Probably the most conventional approach is to mul- 
tiply (mix) the IF signal with two quadrature-phase lo- 
cal oscillator (LO) signals at IF. The mixer outputs are 
lowpass filtered and digitized by "slow" A/D converters. 
Figure 2 shows a block diagram of this approach. 


(Block diagram labels: LOWPASS FILTER, SLOW A/D)

Figure 2. Conventional I/Q Downconversion and
Input Sampling

The required operating bandwidth of these A/D’s is 
the sampling rate, which, according to the Nyquist theo- 
rem, is lower bounded by the width of the receiver’s 
stopband. In the present design, the stopband width is 
approximately 6 kHz. 

For ideal circuit components, this yields mathemati- 
cally perfect I and Q samples. However, because of the 
separate analog paths of the I and Q signals, the possi- 
bility of gain and phase mismatch exists. Selecting com- 
ponents with good match increases the terminal cost. 
There is also the problem of DC bias if active mixers 
are used. The two-path sampling approach also re- 
quires many more discrete components than a single- 
path approach, as used in Conventional Approach B 
and the present implementation. 


Conventional Approach B 

Another approach common in today’s digital wireless 
modems is single-path sub-Nyquist sampling with the 
complex downconversion in DSP. In this approach, the 
input sampling rate has to be at least twice the IF
stopband width. After digital Hilbert transformation, 
the sampling rate can be decimated to a frequency 
equal to the stopband width. 

Implemented Approach 

The bandpass IF signal at 450 kHz was sampled by 
pairs of pulses, with an intra-pair time separation of 
1/(4·IF) = 1/(1.8 MHz) and inter-pair separation of
1/(6 kHz). Figure 3 explains the concept. 


(Figure annotation: 1/IF = 1/(450 kHz))



Figure 3. Concept: Combined I/Q Downconversion 
and Input Sampling 

The process may be thought of as sampling with the
complex sampling function (1 + j1) δ(t - nT). The
90-degree phase shift between the sampling pulses in each
pair is achieved by time staggering. As in Conventional
Approach B, only a single A/D is required. Moreover, 
the input rate is 6 kHz, unlike Conventional Approach 
B, in which the rate would be 12 kHz. Figure 4 shows 
the hardware block diagram. 

Figure 4. Hardware Block Diagram: Combined I/Q
Downconversion and Input Sampling 

The economy of this approach, both in sampling
circuitry and DSP code, is obvious. Compared to
Conventional Approach A, the advantages are in circuit
matching requirements and low parts count. Compared 
to Conventional Approach B, it avoids the digital 
Hilbert transformer and runs at a lower input sampling 
rate. 
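
The pair-sampling idea can be illustrated with a minimal sketch. It assumes an unmodulated test tone and a pair rate of exactly 6 kHz (so that 450 kHz is an integer multiple of the pair rate); the function names and the sign convention applied to the second sample are illustrative assumptions, not details taken from the implementation described here.

```python
import cmath
import math

F_IF = 450e3            # nominal IF from the text, Hz
PAIR_RATE = 6e3         # pair repetition rate from the text, Hz
TAU = 1.0 / (4 * F_IF)  # intra-pair stagger: a quarter of the IF period


def if_signal(t, phase, offset_hz=0.0):
    """Unmodulated test tone at the IF plus an optional offset (illustrative)."""
    return math.cos(2 * math.pi * (F_IF + offset_hz) * t + phase)


def sample_pair(n, phase, offset_hz=0.0):
    """Build one complex baseband sample from a time-staggered sample pair.

    The sign applied to the second sample is an assumed convention, chosen
    so that a tone exactly at the nominal IF maps to exp(j*phase).
    """
    t = n / PAIR_RATE
    first = if_signal(t, phase, offset_hz)
    second = if_signal(t + TAU, phase, offset_hz)
    return complex(first, -second)


z = sample_pair(n=3, phase=0.7)
print(abs(z), cmath.phase(z))
```

For a tone at the nominal IF, the magnitude and phase of the tone are recovered directly from the two real samples, with no Hilbert transformer and no second analog path.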

Although economical in hardware and immune to 
mismatch errors, this approach does have its own im- 
perfection. However, the imperfection is acceptable 
here because of the noisy input signal. The input sam- 
ples are at perfect phase quadrature only at the nominal 
IF (450 kHz). If the band of interest (stopband) is 
small compared to the nominal IF, the phase error rel- 
ative to quadrature, even at the band edge, is small. 

For example, in the present design, the stopband is
approximately +/-3 kHz. The phase error for this
frequency is θ = +/-0.6 degrees. It can be shown that this
creates a cochannel self-interference term that is at
20 log10{sin(θ)} relative to the desired signal, i.e.
approximately -40 dBc. The fact that the modem operates
in Es/No of typically 3.7 dB makes this level of
self-interference quite acceptable.
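
The figures quoted above can be checked with a short calculation. The sketch assumes that the quadrature error is simply the extra phase a frequency offset accumulates over the intra-pair stagger of 1/(4·IF); the function names are illustrative.

```python
import math

F_IF = 450e3            # nominal IF from the text, Hz
TAU = 1.0 / (4 * F_IF)  # intra-pair time stagger, s


def quadrature_phase_error_deg(offset_hz):
    """Phase error relative to exact quadrature at a given offset from IF."""
    return math.degrees(2 * math.pi * offset_hz * TAU)


def self_interference_dbc(offset_hz):
    """Self-interference level, 20*log10(sin(theta)), relative to the signal."""
    theta = math.radians(quadrature_phase_error_deg(offset_hz))
    return 20 * math.log10(math.sin(theta))


print(round(quadrature_phase_error_deg(3e3), 2))  # 0.6
print(round(self_interference_dbc(3e3), 1))       # -39.6
```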

The present approach also requires a faster A/D 
than the conventional approaches. In the present
design, an operational bandwidth of 4·IF = 1.8 MHz is
required, as opposed to a bandwidth of 6 kHz in
Conventional Approach A and 12 kHz in Conventional
Approach B. However, low cost flash A/D’s
in the low MHz range are now available, making this
approach a better choice from a cost standpoint. The
cost of the A/D used, in quantity-10K, was $6.75.

CARRIER ACQUISITION ALGORITHMS 

Demodulator Requirements 

The detailed performance requirements are given in 
the SDM [1] and are not repeated here. However, the 
key challenges are highlighted. 

Transmit Signal Characteristics

    Modulation:        Unfiltered BPSK
    Coding:            Rate-1/2 Convolutional
    Symbol rate:       1200 bps

Fading Channel Characteristics

    Unfaded C/N0:      34.0 dBHz
    Fading Type:       Rician, C/M = 7 dB
    Fading bw.:        0.7 Hz

Blocked Channel Characteristics

    Unblocked C/N0:    35.0 dBHz
    Duration:          2.7 s
    Period:            8.9 s

Doppler Characteristics

    Max. Shift:        +/- 850 Hz
    Variation Rate:    +/- 10 Hz/s
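
To relate the C/N0 values above to the Es/No figure quoted earlier, the standard conversion Es/No = C/N0 - 10 log10(symbol rate) can be applied. The sketch below assumes 1200 channel symbols per second, as listed above; the function name is illustrative.

```python
import math


def es_over_n0_db(c_over_n0_dbhz, symbol_rate_hz=1200.0):
    """Convert C/N0 in dBHz to per-symbol Es/No in dB."""
    return c_over_n0_dbhz - 10 * math.log10(symbol_rate_hz)


for cn0 in (34.0, 35.0):
    print(cn0, round(es_over_n0_db(cn0), 1))
# 34.0 3.2
# 35.0 4.2
```

The two results bracket the typical Es/No of 3.7 dB mentioned in the input sampling discussion.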

In [1], the performance requirements are specified in
terms of the Packet Error Rate (PER) for the fading
channel and the blocked channel separately. Performance
specifications for different packet sizes and input
C/No are provided; here we cite only those for the
128-byte packets and the above C/No values for illustration.

Typical SDM Performance Requirements

    PER(128) fading ch.:     8%
    PER(128) blocked ch.:    10%

When the demodulator performance requirements
are translated into carrier acquisition requirements, the
following facts emerge:

(1) The low prevailing C/N0, while making the
demodulation task difficult, makes it possible to use
non-ideal processing techniques. This was exploited in
the input sampling scheme.

(2) Conventional phase locked loop techniques for 
carrier acquisition will not work because of the con- 
flicting requirements of large capture bandwidth (+/- 
850 Hz) and rapid acquisition on the one hand, and low 
phase noise on the other. The capture range was actu- 
ally set even higher, at +/- 1300 Hz, to enable rapid 
frequency search on power up. During the latter phase, 
the receiver is hopped in 2.5-kHz steps. Rapid carrier 
acquisition is required so that (a) the initial frequency 
search time is short, and (b) not many bits are lost 
when the LMES emerges from a blockage or transmit 
mode (the communication is half-duplex). 

Review of Carrier Recovery Techniques for BPSK De- 
modulation 

The two major problems in BPSK demodulation are 
recovery of the carrier phase and symbol clock phase. 

In this paper, we discuss only the former. 

Carrier recovery may be performed by either open 
loop or closed loop techniques. One popular open loop 
technique is to continuously operate an FFT in the 
background and use it to obtain a coarse frequency es- 
timate; this is used to aid a closed loop carrier syn- 
chronizer. An alternative approach is described by 
Viterbi [2]. Both of these techniques are much more 
complex and demanding of DSP resources than the 
closed loop techniques. It was therefore decided to im- 
plement the present demodulator based on closed loop 
techniques alone. 

A simple phase locked loop cannot be used because 
a BPSK signal has a suppressed carrier. However, 
modified phase locked loops are usable, such as the 
squaring loop and the Costas loop. Many textbooks,
e.g. [3], provide extensive coverage of both techniques. 
The Costas loop has the advantage over the squaring 
loop that it is capable of wider bandwidth operation 
[ibid p.304]. Therefore, the Costas loop was chosen. 
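
The reason a Costas loop can track a suppressed-carrier BPSK signal can be seen from the textbook noise-free model of its phase detector, in which the error is the product of the in-phase and quadrature arm outputs. The sketch below uses that generic model with illustrative names; it is not the detector implemented in the terminal described here.

```python
import math


def costas_error(data_symbol, phase_error_rad, amplitude=1.0):
    """Noise-free Costas phase detector output for one BPSK symbol.

    data_symbol is +1 or -1. The product of the two arms removes the data
    polarity, leaving (amplitude**2 / 2) * sin(2 * phase_error).
    """
    i_arm = amplitude * data_symbol * math.cos(phase_error_rad)
    q_arm = amplitude * data_symbol * math.sin(phase_error_rad)
    return i_arm * q_arm


e_plus = costas_error(+1, 0.2)
e_minus = costas_error(-1, 0.2)
print(e_plus, e_minus)  # equal: the error does not depend on the data bit
```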

Cahn has analyzed the performance of the Costas
loop and shown that, in most receive applications, there
is a conflict between the required lock-time/capture-range
and the acceptable level of phase noise [4]. This
is explained below.

The capture range of a basic Type I phase locked
loop (without a perfect integrator in the loop filter) is
directly proportional to the loop resonance frequency,
and hence also the loop bandwidth [3, p.364]. As the
loop bandwidth determines the amount of phase noise 
in the loop’s voltage controlled oscillator (VCO), it is 
clear that large capture bandwidth and low phase noise 
are conflicting requirements for a Type I loop. 

In Type II loops (loop filter has a perfect integrator), 
the capture range is theoretically unbounded. In practi- 
cal systems, it is bounded by the loop’s dynamic range. 
Thus, for Type II loops, the capture range and phase
noise (loop bandwidth) are unrelated. 

We now examine the lock time. This is given by 
Gardner for Type II loops as [5, p. 76] 

$T_{acq} = 4.2\,(\Delta f)^2 / B_n^3$    (1)

where,

$T_{acq}$: acquisition time

$\Delta f$: frequency offset

$B_n$: loop noise bandwidth

The expression for Type I loops is very similar 
[Spilker, p.364] and differs only in the multiplying con- 
stant. Irrespective of the type of loop, note that the 
lock time is inversely proportional to the cube of the 
loop bandwidth. This makes it difficult to simultane- 
ously achieve rapid phase lock during carrier search, 
and low phase noise during carrier tracking. 
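
The scaling in Equation (1) can be made concrete with a short calculation. The sketch assumes a loop noise bandwidth of 60 Hz (the loop bandwidth adopted later in this paper) and treats Equation (1) as a noise-free approximation, so the numbers only illustrate the trend; the function name is illustrative.

```python
def pull_in_time_s(freq_offset_hz, loop_noise_bw_hz):
    """Equation (1): T_acq = 4.2 * (delta_f)**2 / B_n**3, in seconds."""
    return 4.2 * freq_offset_hz ** 2 / loop_noise_bw_hz ** 3


for offset_hz in (100.0, 850.0, 1300.0):
    print(offset_hz, round(pull_in_time_s(offset_hz, 60.0), 2))
# 100.0 0.19
# 850.0 14.05
# 1300.0 32.86
```

Because the offset enters as a square, the large offsets quoted in this paper give predicted lock times of tens of seconds for an unaided loop of this bandwidth, while halving the bandwidth multiplies every entry by eight.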

Cahn proposed to overcome this problem by creating 
an outer frequency locked loop around the inner phase 
locked loop, as shown in Figure 5 (excluding the adap- 
tive AFC gain control, which is the contribution of the 
present work). 


(Block diagram label: frequency discriminator)



Figure 5. AFC-aided Costas Loop 

The outer loop acts as an automatic frequency control
(AFC) loop; this configuration is known as the
"aided" phase locked (or Costas) loop. Both the outer
and the inner loops provide error signals to the VCO
input. However, the AFC loop’s contribution is
proportional to the frequency difference, and not the phase
difference, relative to the input carrier. This configuration
increases the capture range and reduces the lock
time because an AFC loop can perform the task of
pulling in carriers with large offset much better than a
phase locked loop. However, the AFC loop also contributes
noise to the VCO’s driving function. Therefore,
the AFC loop bandwidth has to be limited so that
its contribution to the VCO’s phase noise is small compared
to that of the phase locked loop. Cahn found an
AFC loop bandwidth of 0.1 times the bandwidth of the
phase locked loop to be a suitable choice.

New Carrier Recovery Scheme 

Although Cahn’s AFC loop solves the capture range 
problem and provides some help in reducing the lock 
time, the latter is still unacceptable per the present de- 
sign goals, given below. 

Carrier Recovery Design Goals

    Loop Bandwidth:           60 Hz
                              (determined by phase noise
                              and Doppler rate tracking)
    Capture Range:            +/- 1300 Hz
    Mean Acquisition Time:    1 s

In the present demodulator, the lock time was fur- 
ther reduced over Cahn by making the AFC loop gain 
adaptive. The inner loop was a Type II Costas loop 
with 60-Hz loop bandwidth, a capture range of over 100 
Hz and an acquisition time (for 100-Hz offset carriers) 
of approximately 0.8 s. The outer loop had a capture 
range of +/- 1300 Hz and an acquisition time of
approximately 1.0 s. The adaptive AFC gain control
scheme is described below. 

The AFC gain scheme first investigated was: 

IF (.NOT. inner_loop_lock) THEN
    AFC_GAIN = HIGH
ELSE
    AFC_GAIN = LOW
END IF

Practical implementation of this scheme revealed a 
number of problems. It was found that, in order to 
achieve the target lock time of 1 s, the AFC loop gain 
had to be increased to a very high value. At this high 
AFC gain, the phase noise contributed by the AFC loop 
often prevented the inner loop from locking. Moreover,
the AFC loop gain would undergo damped oscillation
around its steady state value for an unacceptable
length of time before the frequency uncertainty settled 
down to within the 100-Hz capture range of the inner 
loop. This meant that the decision to switch the AFC 
gain to a low value could not be based on a lock detec- 
tor operating on the inner loop. The AFC loop would 
have to autonomously switch gain, based on some mea- 
surement of its own state. 

Autonomous AFC Gain Switching

The key requirement is for the AFC loop to determine
that it is sufficiently close to its steady state value.
The time response of the AFC loop’s error signal, e_AFC,
to a step change in frequency, for large AFC gains, has
the characteristic underdamped shape shown in Figure 6.


(Figure annotation: AFC lock)



Figure 6. Typical AFC error signal response without 
noise (artist’s impression) 

It is clear that a change in sign of the derivative of
e_AFC (noise assumed to be absent) indicates that e_AFC
has just traversed its first peak. Usually, this point is
sufficiently close to the steady state value. Thus, a
change in sign of the first derivative of e_AFC, say e_AFC',
may be taken as the signal to clamp down the AFC
gain. Figure 6 shows this conceptually.

When noise is present, this approach is not foolproof
as noise can cause premature sign changes in
e_AFC'. The following remedy was applied to this problem.
e_AFC was filtered by a 1.6 Hz bandwidth filter before
its derivative was taken. However, this measure,
by itself, could not eliminate all occurrences of spurious 
AFC lock indication. Thus a waiting time of 0.8 s was 
introduced on each occurrence of AFC lock. During 
this time, the AFC gain would be clamped down to its 
LOW value. If, at the end of this period, the inner loop 
still indicated no lock, the AFC gain would be returned 
to its HIGH value. The waiting time was selected to be 
0.8 s as this was the acquisition time of the inner loop. 
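
The switching rule described above can be sketched as a small state machine. The 1.6 Hz filter bandwidth and the 0.8 s waiting time come from the text; the one-pole filter structure, the first-difference derivative, the update rate and all names are assumptions made for illustration only.

```python
import math

HIGH, LOW = "HIGH", "LOW"


class AfcGainSwitch:
    """Clamp the AFC gain when the filtered error derivative changes sign."""

    def __init__(self, sample_rate_hz, filter_bw_hz=1.6, wait_s=0.8):
        self.dt = 1.0 / sample_rate_hz
        # One-pole low-pass coefficient (the filter structure is an assumption).
        self.alpha = 1.0 - math.exp(-2.0 * math.pi * filter_bw_hz * self.dt)
        self.wait_s = wait_s
        self.filtered = 0.0
        self.prev_slope = 0.0
        self.timer = 0.0
        self.gain = HIGH

    def update(self, e_afc, inner_loop_locked):
        """Process one AFC error sample and return the gain to apply."""
        previous = self.filtered
        self.filtered += self.alpha * (e_afc - self.filtered)
        slope = self.filtered - previous          # first-difference derivative
        sign_change = slope * self.prev_slope < 0.0
        self.prev_slope = slope

        if inner_loop_locked:
            self.gain, self.timer = LOW, 0.0
        elif self.gain == HIGH:
            if sign_change:                       # "AFC lock" indication
                self.gain, self.timer = LOW, 0.0
        else:
            self.timer += self.dt                 # waiting for the inner loop
            if self.timer >= self.wait_s:
                self.gain = HIGH                  # no lock: resume the search
        return self.gain
```

In this sketch the gain stays LOW for as long as the inner loop reports lock; if lock has not been reached by the end of the waiting time, the gain reverts to HIGH and the search continues.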

Special Accommodation for Fading and Blocked 
Channels 

Some customization of the above concepts was incorporated
to further improve performance in fading
and blocked channels. Instead of two AFC gains, three 
gain values were used — HIGH, MEDIUM and LOW. 

High AFC gain was used on "initial search" for the 
carrier. The "initial search" condition was defined to 
exist on power up and on changing the receiver’s tuned 
frequency. If, after once acquiring the carrier, it was 
lost (presumably due to fading or blockage) then the 
MEDIUM gain was applied. The use of a medium gain
ensured rapid resynchronization when the carrier returned
from a fade or blockage. As it had been gone
only for a short period, the frequency could not have 
changed very much. If the inner loop remained contin- 
uously out of lock for more than 25 s, the "initial search" 
condition was declared to exist, on the assumption that 
significant frequency change might have occurred in the 
intervening period (due to Doppler or oscillator drift). 
The LOW gain was applied only when the inner loop 
was locked. 
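
The three-level gain selection can be summarized as a small decision function. The 25 s timeout and the HIGH/MEDIUM/LOW rules follow the description above; the function signature and the way the "initial search" flag is carried between calls are illustrative assumptions, and the temporary clamping of the gain during the 0.8 s waiting time is left out for brevity.

```python
HIGH, MEDIUM, LOW = "HIGH", "MEDIUM", "LOW"
REACQUIRE_TIMEOUT_S = 25.0  # continuous out-of-lock time that forces a new search


def select_afc_gain(inner_loop_locked, initial_search, seconds_out_of_lock):
    """Return (gain, initial_search) for the current loop state.

    initial_search is True after power up or retuning, and is cleared once
    the inner loop has locked.
    """
    if inner_loop_locked:
        return LOW, False
    if initial_search or seconds_out_of_lock > REACQUIRE_TIMEOUT_S:
        return HIGH, True
    return MEDIUM, False


print(select_afc_gain(False, True, 0.0))    # ('HIGH', True)
print(select_afc_gain(False, False, 3.0))   # ('MEDIUM', False)
print(select_afc_gain(False, False, 26.0))  # ('HIGH', True)
print(select_afc_gain(True, False, 0.0))    # ('LOW', False)
```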

DEMODULATOR PERFORMANCE RESULTS 

The "proof of the pudding" for the above techniques
is in meeting the SDM PER requirements and the derived
requirement of 1-s carrier acquisition time. Figures
7 and 8 show the PER performance of the Rockwell
LMES demodulator in the SDM fading and blocked
channels, respectively.



Figure 7. Fading Channel Performance 


(Plot title: SIMULATION RESULTS FOR SHADOWED CHANNEL; horizontal axis: Es/No (dB), 3.00 to 6.00)

Figure 8. Blocked Channel Performance

It is clear that Inmarsat-C performance requirements 
are satisfied. 

The statistics of the carrier acquisition time, for 850
Hz offset, in a fading channel with unfaded C/No = 34
dBHz, are given below (simulation results).

Carrier Acquisition Statistics 

Mean carrier acquisition time: 1.1 s 

Median carrier acquisition time: 0.9 s 

90-percentile carrier acquisition time: 1.8 s 

Since the other major aim of the design was cost 
minimization, the outcome of that effort is noted below. 

The processing speed requirement of the Demodulator
part of the algorithm is 4.8 MIPS, while that for
the entire DSP subsystem is 5.7 MIPS. The program
code size is approximately 1.5K. The implementation
was based on one Analog Devices ADSP-2101 chip, for
which the quantity-10K unit price is approximately $38.
If the program memory size could be reduced to under
1K, the ADSP-2105 chip could be used at the quantity-10K
unit price of $20. This is considered feasible by
additional innovations in code optimization and is
planned for future product revisions.

The entire cost of the DSP subsystem, including external
memory, A/D and other sampling circuit components,
is under $60.

SUMMARY 

When addressing a mass market, it is important to 
minimize product cost while keeping product perfor- 
mance above a defined level of acceptability. While the 
costs of DSP parts have been falling in general, a sig- 
nificant difference still exists between the "low-end" 
parts with modest MIPS and on-chip memory, and the 
higher end parts featuring greater DSP resources. In 
this paper, some novel input sampling techniques and 
DSP algorithms are presented which helped to realize 
an Inmarsat-C demodulator using low-end parts. 